Defining and Measuring Voice Quality

Authors

  • Jody Kreiman
  • Diana Vanlancker-Sidtis
  • Bruce R. Gerratt
Abstract

Although voices provide listeners with significant information about speakers, defining and measuring voice quality remain elusive goals. We argue that the much-maligned ANSI standard definition of sound quality is in fact an appropriate definition, because it treats quality as the result of a perceptual process rather than a fixed quantity, and highlights the interaction between listeners and signals in determining quality in the context of specific perceptual goals. Which aspects of the signal are important will depend on the task, the characteristics of the stimuli, the listener’s background, perceptual habits, and so on. Given the many kinds of information listeners extract from voice signals, it is not surprising that these characteristics vary from task to task, voice to voice, and listener to listener. Application of speech synthesis in method-of-adjustment tasks allows measurement of quality psychoacoustically as those aspects of the signal that allow a listener to determine that two sounds of equal pitch and loudness are different, and holds promise for improving the reliability and validity of measures of voice quality.

1. What is voice? The definitional dilemma

The speaking voice naturally conveys information about the speaking individual. The impressions listeners gain from voices are not necessarily accurate, but nevertheless voice quality serves as a primary means by which speakers project their identity (their “physical, psychological, and social characteristics”) to the world [1]. The measurement of vocal quality thus plays an important role in many disciplines, and topics related to the perception and measurement of voice quality have implications for fields ranging from evolutionary biology to music to law enforcement to medicine. Such topics encompass much of human existence, and indicate how central voice quality is to us.
It has proven difficult to provide a single, useful, all-purpose definition of voice, in part because of the broad range of functions voice subserves. As Sundberg noted [2], everyone knows what voice is until they try to pin it down, and several senses of the term are in common use. Definitions of voice fall into two general classes.

Voice can be narrowly defined as “sound produced by vibration of the vocal folds,” excluding the effects of vocal tract resonances, vocal tract excitation from turbulent noise, and everything else that occurs during speech production. Because anatomical constraints make it difficult to study voice as narrowly defined, most authors adopt the practical expedient of controlling for the effects of the vocal tract on voice by restricting voice samples to steady-state vowels (usually /a/). This practice allows experimenters to study natural-sounding phonation while holding non-laryngeal factors constant. This approach is the most common implementation of narrow definitions of voice.

Voice can also be broadly defined as essentially synonymous with speech. Besides details of phonatory quality, factors such as articulatory details, pitch and amplitude variations, and temporal patterning all contribute to how a speaker “sounds”. Broad definitions of voice reflect this fact, and generally portray voice as the result of a complex sequence of cognitive, physiological, aerodynamic, and acoustic events. The information that complete voice patterns convey (more or less successfully) about affect, attitude, psychological state, pragmatics, grammatical function, sociological status, and personal identity emerges from this complex enfolding of phonatory, phonetic, and temporal detail. Precisely which stage in this chain of events receives definitional focus depends on the interest of the practitioner or experimenter, or on the task faced by the listener.
For example, surgeons typically approach voice in terms of physiological function, with secondary concern for the exact perceived quality that results from phonation. Engineers are often interested in the acoustic waveform that correlates with vocal sound, and therefore define voice in terms of acoustic attributes. In contrast, psychologists are not especially interested in how the voice is physically produced, but instead define voice in terms of what a listener hears.

Defining voice quality is as problematic as defining voice. The overall quality (or timbre) of a sound is traditionally defined as “that attribute of auditory sensation in terms of which a listener can judge that two sounds similarly presented and having the same loudness and pitch are dissimilar” [4]. By this definition, quality is multidimensional, including the spectral envelope and its changes in time, fluctuations of amplitude and fundamental frequency, and the extent to which the signal is periodic or aperiodic [5]. This large number of degrees of freedom makes it difficult to operationalize the concept of quality, particularly across tasks. According to the ANSI definition, quality is a perceptual response in the particular task of determining that two sounds are dissimilar, and it is unclear how this definition might generalize to other common, seemingly related tasks like recognizing a speaker or evaluating a single stimulus. Evidence [6] also suggests that quality may not be independent of frequency and amplitude, as the ANSI definition seemingly requires. Finally, this definition is essentially negative: it states that quality is not pitch and loudness, but does not indicate what it

VOQUAL'03, Geneva, August 27-29, 2003
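
The ANSI definition quoted above can be made concrete with a small numerical sketch. The Python below is an illustration only, not a method from the paper: it builds two harmonic tones with the same fundamental frequency and the same RMS level but different spectral slopes, then compares their spectral centroids. Several things here are assumptions introduced for the example: the sample rate, the 200 Hz fundamental, the roll-off values, the use of RMS as a crude stand-in for loudness (loudness is a perceptual quantity, not an acoustic one), and the choice of spectral centroid as one of many possible acoustic dimensions along which quality could differ.

```python
import math

SAMPLE_RATE = 16000   # Hz; assumed for illustration
F0 = 200.0            # shared fundamental frequency in Hz; assumed
DURATION = 0.1        # seconds; a whole number of periods at 200 Hz
N_HARMONICS = 10      # highest harmonic (2 kHz) stays below Nyquist


def harmonic_amplitudes(rolloff_db_per_octave):
    """Amplitude of each harmonic, falling off at a fixed slope in dB/octave."""
    return [
        10 ** (-rolloff_db_per_octave * math.log2(k) / 20)
        for k in range(1, N_HARMONICS + 1)
    ]


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def synthesize(amps, target_rms=0.1):
    """Sum sine harmonics of F0 and rescale the result to a common RMS level."""
    n = int(SAMPLE_RATE * DURATION)
    samples = [
        sum(
            a * math.sin(2 * math.pi * F0 * k * t / SAMPLE_RATE)
            for k, a in enumerate(amps, start=1)
        )
        for t in range(n)
    ]
    scale = target_rms / rms(samples)
    return [s * scale for s in samples]


def spectral_centroid(amps):
    """Amplitude-weighted mean frequency of the harmonics, in Hz."""
    freqs = [F0 * k for k in range(1, len(amps) + 1)]
    return sum(f * a for f, a in zip(freqs, amps)) / sum(amps)


# Two tones: same F0, same RMS, different spectral slope.
bright_amps = harmonic_amplitudes(3.0)    # shallow roll-off
dull_amps = harmonic_amplitudes(12.0)     # steep roll-off
bright = synthesize(bright_amps)
dull = synthesize(dull_amps)

print("RMS levels:", rms(bright), rms(dull))
print("Spectral centroids (Hz):",
      spectral_centroid(bright_amps), spectral_centroid(dull_amps))
```

The two signals are matched on the two attributes the definition holds constant (approximated here by F0 and RMS) yet differ in spectral shape, which is the kind of residual difference the definition assigns to quality. Whether a listener would actually hear the difference, and which acoustic dimension they would attend to, is exactly the perceptual question the text says depends on task and listener.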




Publication date: 2003